Using XPath with the XmlDocument class in C#

In the previous chapter, we used the XmlDocument class to get information out of an XML file. We did it with a series of calls to the ChildNodes property, which was manageable only because the example was very simple. It did not do much for the readability of our code, though, so in this chapter we will look at another approach, which is more powerful and yet easier to read and maintain. The technology we will use for this is called XPath, and it is maintained by the same organization that created the XML standard. XPath is actually an entire query language with lots of possibilities, but since this is not an XPath tutorial, we will only look at some basic queries. However, even in its simplest forms, XPath is still powerful, as you will see in the following examples.

The XmlDocument class has several methods which take an XPath query as a parameter and return the resulting XmlNode(s). In this chapter we will look at two of them: the SelectSingleNode() method, which returns a single XmlNode based on the provided XPath query, and the SelectNodes() method, which returns an XmlNodeList collection of XmlNode objects based on the provided XPath query.
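To make the difference between the two methods concrete, here is a minimal sketch that runs against a small inline XML string instead of a file. The XML content and the element names (catalog, book) are made up for illustration and are not part of this tutorial's data.

```csharp
using System;
using System.Xml;

class SelectMethodsSketch
{
    static void Main()
    {
        // Hypothetical sample data, loaded from a string with LoadXml()
        XmlDocument xmlDoc = new XmlDocument();
        xmlDoc.LoadXml("<catalog><book>First</book><book>Second</book></catalog>");

        // SelectSingleNode() returns the first match, or null if nothing matches
        XmlNode firstBook = xmlDoc.SelectSingleNode("/catalog/book");
        if (firstBook != null)
            Console.WriteLine(firstBook.InnerText);   // prints "First"

        // SelectNodes() returns every match as an XmlNodeList
        XmlNodeList allBooks = xmlDoc.SelectNodes("/catalog/book");
        Console.WriteLine(allBooks.Count);            // prints 2
    }
}
```

Both calls use the same query; only the shape of the result differs.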

We will try both of the methods mentioned above, but instead of the currency information XML we used in the previous chapters, we will now try a new XML source. RSS feeds are essentially XML documents built in a specific way, so that a whole range of different news readers can parse and show the same information in their own way.

We will use an RSS feed from CNN, located at http://rss.cnn.com/rss/edition_world.rss, with news from around the world. If you open it in your browser, the browser may render it nicely, giving you an overview of the feed and a way to subscribe to it, but don't be fooled by that: under the hood, it is just XML, which you will see if you do a "View source" in your browser. You will see that the root element is called "rss". The rss element usually has one or several "channel" elements, and within this element we find information about the feed as well as the "item" nodes, which are the news items we usually want.
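The structure described above can be illustrated with a heavily trimmed, hand-written RSS document. The feed content below is invented for illustration; a real feed such as CNN's contains more elements and attributes than this sketch assumes.

```csharp
using System;
using System.Xml;

class RssStructureSketch
{
    static void Main()
    {
        // Invented, minimal RSS document: one channel with a title and two items
        string rss =
            "<rss version=\"2.0\">" +
            "<channel>" +
            "<title>Example feed</title>" +
            "<item><title>First story</title><pubDate>Mon, 01 Jan 2024 10:00:00 GMT</pubDate></item>" +
            "<item><title>Second story</title><pubDate>Mon, 01 Jan 2024 11:00:00 GMT</pubDate></item>" +
            "</channel>" +
            "</rss>";

        XmlDocument xmlDoc = new XmlDocument();
        xmlDoc.LoadXml(rss);

        // The root element of an RSS feed is <rss>
        Console.WriteLine(xmlDoc.DocumentElement.Name);

        // List the element names directly below <channel>: title, item, item
        XmlNode channel = xmlDoc.DocumentElement.FirstChild;
        foreach (XmlNode child in channel.ChildNodes)
            Console.WriteLine("  " + child.Name);
    }
}
```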

In the following example, we will use the SelectSingleNode() method to get the title of the feed. If you look at the XML, you will see that there is a <title> element as a child element of the <channel> element, which is in turn a child element of the <rss> element, the root. That query can be described like this in XPath:

//rss/channel/title

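We simply write the names of the elements we are looking for, separated by a forward slash (/), which states that each element should be a child of the element before the preceding slash. The leading double slash (//) tells XPath to look for the rss element anywhere in the document. Note that element names in XPath are case sensitive. Using this XPath is as simple as this: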


using System;
using System.Text;
using System.Xml;

namespace ParsingXml
{
    class Program
    {
        static void Main(string[] args)
        {
            XmlDocument xmlDoc = new XmlDocument();
            xmlDoc.Load("http://rss.cnn.com/rss/edition_world.rss");
            XmlNode titleNode = xmlDoc.SelectSingleNode("//rss/channel/title");
            if(titleNode != null)
                Console.WriteLine(titleNode.InnerText);
            Console.ReadKey();   
        }
    }
}

We use the SelectSingleNode() method, which simply takes our XPath query as a string parameter, to locate the <title> element. We then check that it returned a result (the method returns null when nothing matches), and if it did, we print the InnerText of the located node, which should be the title of the RSS feed.
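As a small offline sketch of the same pattern, the code below runs SelectSingleNode() against a hand-written feed (the XML is invented for illustration) and shows what happens when the query matches nothing. The second query differs only in letter case, which is enough to make it fail, since XPath element names are case sensitive.

```csharp
using System;
using System.Xml;

class SingleNodeSketch
{
    static void Main()
    {
        // Invented stand-in for a real feed
        XmlDocument xmlDoc = new XmlDocument();
        xmlDoc.LoadXml("<rss><channel><title>Example feed</title></channel></rss>");

        // Matches the <title> element below <channel>
        XmlNode titleNode = xmlDoc.SelectSingleNode("//rss/channel/title");
        Console.WriteLine(titleNode != null ? titleNode.InnerText : "(no match)");
        // prints "Example feed"

        // Same path with different casing: no element matches, so the result is null
        XmlNode wrongCase = xmlDoc.SelectSingleNode("//RSS/Channel/Title");
        Console.WriteLine(wrongCase != null ? wrongCase.InnerText : "(no match)");
        // prints "(no match)"
    }
}
```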

In the next example, we will use the SelectNodes() method to find all the item nodes in the RSS feed and then print out information about them:


using System;
using System.Text;
using System.Xml;

namespace ParsingXml
{
    class Program
    {
        static void Main(string[] args)
        {
            XmlDocument xmlDoc = new XmlDocument();
            xmlDoc.Load("http://rss.cnn.com/rss/edition_world.rss");
            XmlNodeList itemNodes = xmlDoc.SelectNodes("//rss/channel/item");
            foreach(XmlNode itemNode in itemNodes)
            {
                XmlNode titleNode = itemNode.SelectSingleNode("title");
                XmlNode dateNode = itemNode.SelectSingleNode("pubDate");
                if((titleNode != null) && (dateNode != null))
                    Console.WriteLine(dateNode.InnerText + ": " + titleNode.InnerText);
            }
            Console.ReadKey();   
        }
    }
}

The SelectNodes() method takes an XPath query as a string, just as we saw in the previous example, and then returns a list of XmlNode objects in an XmlNodeList collection. We iterate through it with a foreach loop, and from each of the item nodes we ask for a child node called title and one called pubDate (published date), using SelectSingleNode() directly on the item node. If we get both of them, we print the date and the title on the same line and then move on.
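One detail worth seeing in isolation is that a query run on an item node is evaluated relative to that node only when it does not start with a slash. The sketch below uses an invented, hand-written feed to compare the relative query "title" with "//title", which searches the whole document no matter which node it is called on.

```csharp
using System;
using System.Xml;

class RelativeQuerySketch
{
    static void Main()
    {
        // Invented feed: a channel title plus two items with their own titles
        XmlDocument xmlDoc = new XmlDocument();
        xmlDoc.LoadXml(
            "<rss><channel><title>Example feed</title>" +
            "<item><title>First story</title></item>" +
            "<item><title>Second story</title></item>" +
            "</channel></rss>");

        XmlNodeList itemNodes = xmlDoc.SelectNodes("//rss/channel/item");
        foreach (XmlNode itemNode in itemNodes)
        {
            // Relative query: looks only at direct children of this item
            XmlNode relative = itemNode.SelectSingleNode("title");

            // Query starting with "//": searches the whole document,
            // so it finds the channel title first every time
            XmlNode absolute = itemNode.SelectSingleNode("//title");

            Console.WriteLine(relative.InnerText + " / " + absolute.InnerText);
        }
        // prints "First story / Example feed"
        // then   "Second story / Example feed"
    }
}
```

This is why the example above asks for "title" and "pubDate" without any leading slashes.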

In our example, we wanted two different values from each item node, which is why we asked for the item nodes and then processed each of them. However, if we only needed the title of each item, we could change the XPath query to something like this:

//rss/channel/item/title 

It will match each title node in each of the item nodes. Here's the query with some C# code to make it all happen:


XmlDocument xmlDoc = new XmlDocument();
xmlDoc.Load("http://rss.cnn.com/rss/edition_world.rss");
XmlNodeList titleNodes = xmlDoc.SelectNodes("//rss/channel/item/title");
foreach(XmlNode titleNode in titleNodes)
    Console.WriteLine(titleNode.InnerText);            
Console.ReadKey();
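If the titles are needed for something other than printing, the same query can feed a list. The following is a minimal sketch, assuming an invented inline feed; SelectTexts is a hypothetical helper written for this example, not a method of XmlDocument.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

class TitleListSketch
{
    // Hypothetical helper: returns the InnerText of every node matched by the query
    static List<string> SelectTexts(XmlDocument doc, string xpath)
    {
        List<string> texts = new List<string>();
        foreach (XmlNode node in doc.SelectNodes(xpath))
            texts.Add(node.InnerText);
        return texts;
    }

    static void Main()
    {
        // Invented feed used in place of the live CNN feed
        XmlDocument xmlDoc = new XmlDocument();
        xmlDoc.LoadXml(
            "<rss><channel><title>Example feed</title>" +
            "<item><title>First story</title></item>" +
            "<item><title>Second story</title></item>" +
            "</channel></rss>");

        // Only titles below <item> match; the channel title is not included
        List<string> titles = SelectTexts(xmlDoc, "//rss/channel/item/title");
        Console.WriteLine(string.Join(", ", titles));
        // prints "First story, Second story"
    }
}
```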